Ball x Pit on mobile, Piece by Piece x2 and other new indie games worth checking out

Engadget

Welcome to our latest roundup of what's going on in the indie game space. A bunch of intriguing games arrived this week, including a mobile port of one of the most absorbing things I've played in years and two completely different titles with the same name. Let's get things started with a look at a few projects that were featured in the latest edition of the Future Games Show. In one of those projects, to recharge your weapons and systems, you have to plug a cable that trails behind your spaceship into a socket. While you're plugged in, your movement is restricted by the length of the tether, but you gain more firepower.



MassSpecGym: A benchmark for the discovery and identification of molecules

Bushuiev, Roman

Neural Information Processing Systems

Despite decades of progress in machine learning applications for predicting molecular structures from MS/MS spectra, the development of new methods is severely hindered by the lack of standard datasets and evaluation protocols. To address this problem, we propose MassSpecGym - the first comprehensive benchmark for the discovery and identification of molecules from MS/MS data.



Explainable and Efficient Randomized Voting Rules

Neural Information Processing Systems

With the rapid growth in the deployment of AI tools for making critical decisions (or aiding humans in doing so), there is a growing demand to explain to stakeholders how these tools arrive at a decision.



Automatic debiased machine learning and sensitivity analysis for sample selection models

Bjelac, Jakob, Chernozhukov, Victor, Klotz, Phil-Adrian, Kueck, Jannis, Schmitz, Theresa M. A.

arXiv.org Machine Learning

In this paper, we extend the Riesz representation framework to causal inference under sample selection, where both treatment assignment and outcome observability are non-random. Formulating the problem in terms of a Riesz representer enables stable estimation and a transparent decomposition of omitted variable bias into three interpretable components: a data-identified scale factor, outcome confounding strength, and selection confounding strength. For estimation, we employ the ForestRiesz estimator, which accounts for selective outcome observability while avoiding the instability associated with direct propensity score inversion. We assess finite-sample performance through a simulation study and show that conventional double machine learning approaches can be highly sensitive to tuning parameters due to their reliance on inverse probability weighting, whereas the ForestRiesz estimator delivers more stable performance by leveraging automatic debiased machine learning. In an empirical application to the gender wage gap in the U.S., we find that our ForestRiesz approach yields larger treatment effect estimates than a standard double machine learning approach, suggesting that ignoring sample selection leads to an underestimation of the gender wage gap. Sensitivity analysis indicates that implausibly strong unobserved confounding would be required to overturn our results. Overall, our approach provides a unified, robust, and computationally attractive framework for causal inference under sample selection.
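The abstract attributes the instability of conventional double machine learning to its reliance on inverse probability weighting, where propensity scores near zero or one produce explosive weights. The sketch below illustrates only that failure mode on a hypothetical simulated dataset; it is not the ForestRiesz estimator or the authors' sample selection model, and the data-generating process, clipping thresholds, and seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process with extreme overlap problems:
# the covariate x strongly drives treatment, so some propensities
# are very close to 0 or 1.
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-4 * x))            # true propensity scores
t = rng.binomial(1, p)                  # treatment indicator
y = 2 * t + x + rng.normal(size=n)      # outcome; true effect is 2

# Horvitz-Thompson-style IPW estimate of the average treatment effect.
# Weights 1/p and 1/(1-p) blow up for near-degenerate propensities.
ipw = np.mean(t * y / p - (1 - t) * y / (1 - p))

# The same estimate with propensities clipped away from 0 and 1,
# a common ad-hoc stabilization; this trades bias for variance.
p_clip = np.clip(p, 0.05, 0.95)
ipw_clip = np.mean(t * y / p_clip - (1 - t) * y / (1 - p_clip))

print("unclipped IPW estimate:", ipw)
print("clipped IPW estimate:  ", ipw_clip)
print("largest unclipped weight:", np.max(t / p))
print("largest clipped weight:  ", np.max(t / p_clip))
```

Automatic debiased machine learning sidesteps this trade-off by estimating the Riesz representer directly, without ever forming the ratio 1/p, which is why the tuning-parameter sensitivity the authors report for weighting-based double machine learning does not arise in the same way.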


Understanding Syntactic Generalization in Structure-inducing Language Models

Arps, David, Sajjad, Hassan, Kallmeyer, Laura

arXiv.org Artificial Intelligence

Structure-inducing Language Models (SiLMs) are trained on a self-supervised language modeling task and induce a hierarchical sentence representation as a byproduct of processing an input. SiLMs couple strong syntactic generalization behavior with competitive performance on various NLP tasks, but many of their basic properties remain underexplored. In this work, we train three different SiLM architectures from scratch: StructFormer (Shen et al., 2021), UDGN (Shen et al., 2022), and GPST (Hu et al., 2024b). We train these architectures on both natural language corpora (English, German, and Chinese) and synthetic bracketing expressions. The models are then evaluated with respect to (i) properties of the induced syntactic representations, (ii) performance on grammaticality judgment tasks, and (iii) training dynamics. We find that none of the three architectures dominates across all evaluation metrics. However, there are significant differences, in particular with respect to the induced syntactic representations. The Generative Pretrained Structured Transformer (GPST; Hu et al., 2024b) performs most consistently across evaluation settings and outperforms the other models on long-distance dependencies in bracketing expressions. Furthermore, our study shows that small models trained on large amounts of synthetic data provide a useful testbed for evaluating basic model properties.
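The paper's exact procedure for generating its synthetic bracketing corpora is not given in the abstract; as a rough illustration of what such a testbed looks like, the sketch below samples balanced bracket strings (Dyck words). The single bracket type, length range, and sampling scheme are assumptions for illustration, not the authors' setup.

```python
import random

def random_dyck(n_pairs, rng):
    """Sample a balanced bracketing expression (a Dyck word) with
    n_pairs bracket pairs. At each step we open a bracket if any
    remain, close one if the current depth allows it, and pick
    randomly when both moves are legal. The result is always
    well-balanced, though not uniformly distributed over Dyck words."""
    out, opened, depth = [], 0, 0
    while len(out) < 2 * n_pairs:
        can_open = opened < n_pairs
        can_close = depth > 0
        if can_open and (not can_close or rng.random() < 0.5):
            out.append("(")
            opened += 1
            depth += 1
        else:
            out.append(")")
            depth -= 1
    return " ".join(out)

rng = random.Random(42)
corpus = [random_dyck(rng.randint(2, 10), rng) for _ in range(5)]
for sentence in corpus:
    print(sentence)
```

A corpus like this makes long-distance dependencies explicit and controllable: every opening bracket must eventually be matched, so a model's ability to track the matching bracket across arbitrary nesting depth can be probed directly, which is presumably what makes such data a useful testbed for the structure-induction properties the authors evaluate.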